 development and validation


Development and Validation of Heparin Dosing Policies Using an Offline Reinforcement Learning Algorithm

Lim, Yooseok, Park, Inbeom, Lee, Sujee

arXiv.org Artificial Intelligence

Appropriate medication dosages in the intensive care unit (ICU) are critical for patient survival. Heparin, used to treat thrombosis and inhibit blood clotting in the ICU, requires careful administration due to its complexity and sensitivity to various factors, including patient clinical characteristics, underlying medical conditions, and potential drug interactions. Incorrect dosing can lead to severe complications such as strokes or excessive bleeding. To address these challenges, this study proposes a reinforcement learning (RL)-based personalized optimal heparin dosing policy that guides dosing decisions reliably within the therapeutic range based on individual patient conditions. A batch-constrained policy was implemented to minimize out-of-distribution errors in an offline RL environment and effectively integrate RL with existing clinician policies. The policy's effectiveness was evaluated using weighted importance sampling, an off-policy evaluation method, and the relationship between state representations and Q-values was explored using t-SNE. Both quantitative and qualitative analyses were conducted using the Medical Information Mart for Intensive Care III (MIMIC-III) database, demonstrating the efficacy of the proposed RL-based medication policy. Leveraging advanced machine learning techniques and extensive clinical data, this research enhances heparin administration practices and establishes a precedent for the development of sophisticated decision-support tools in medicine.
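As a rough illustration of the off-policy evaluation step named in the abstract above, weighted importance sampling (WIS) can be sketched as follows. The trajectory layout, the function name `weighted_importance_sampling`, and the discount factor are assumptions made for this example; they are not details taken from the paper.

```python
from typing import List, Tuple

# Hypothetical step format: (probability of the logged action under the
# evaluation policy, probability under the behaviour/clinician policy, reward).
Step = Tuple[float, float, float]


def weighted_importance_sampling(trajectories: List[List[Step]],
                                 gamma: float = 0.99) -> float:
    """Estimate the value of the evaluation policy from logged trajectories."""
    weights = []   # cumulative importance ratio per trajectory
    returns = []   # discounted return per trajectory
    for traj in trajectories:
        rho = 1.0
        g = 0.0
        for t, (p_eval, p_behaviour, reward) in enumerate(traj):
            rho *= p_eval / p_behaviour
            g += (gamma ** t) * reward
        weights.append(rho)
        returns.append(g)

    total_weight = sum(weights)
    if total_weight == 0.0:
        raise ValueError("all importance weights are zero")
    # WIS normalises by the sum of the weights rather than the trajectory count.
    return sum(w * g for w, g in zip(weights, returns)) / total_weight
```

Normalising by the summed weights makes the estimate biased but usually much lower in variance than ordinary importance sampling, which is one common reason to prefer it on logged clinical data.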


Development and Validation of a Machine Learning Algorithm for Clinical Wellness Visit Classification in Cats and Dogs

Szlosek, Donald, Coyne, Michael, Riggot, Julia, Knight, Kevin, McCrann, DJ, Kincaid, Dave

arXiv.org Artificial Intelligence

Early disease detection in veterinary care relies on identifying subclinical abnormalities in asymptomatic animals during wellness visits. This study introduces an algorithm designed to distinguish between wellness and other veterinary visits. The purpose of this study is to validate the use of a visit classification algorithm compared to manual classification of veterinary visits by three board-certified veterinarians. Using a dataset of 11,105 clinical visits from 2012 to 2017 involving 655 animals (85.3% canines and 14.7% felines) across 544 U.S. veterinary establishments, the model was trained using a Gradient Boosting Machine. Three validators were tasked with classifying 400 visits, including both wellness and other types of visits, selected randomly from the same database used for initial algorithm training, aiming to maintain consistency and relevance between the training and application phases; visit classifications were subsequently categorized into "wellness" or "other" based on majority consensus among validators to assess the algorithm's performance in identifying wellness visits. The algorithm demonstrated a specificity of 0.94 (95% CI: 0.91 to 0.96), implying its accuracy in distinguishing non-wellness visits. The algorithm had a sensitivity of 0.86 (95% CI: 0.80 to 0.92), indicating its ability to correctly identify wellness visits as compared to the annotations provided by veterinary experts. The balanced accuracy, calculated as 0.90 (95% CI: 0.87 to 0.93), further confirms the algorithm's overall effectiveness. The algorithm exhibits strong specificity and sensitivity, ensuring accurate identification of a high proportion of wellness visits. Overall, this algorithm holds promise for advancing research on preventive care's role in subclinical disease identification, but prospective studies are needed for validation.
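The three metrics reported in the abstract above are related by simple arithmetic, which the following minimal sketch makes explicit. The function name and the confusion-matrix counts in the usage lines are hypothetical and chosen only for illustration; they are not the study's data.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, balanced accuracy) from binary counts.

    "Positive" here stands for a wellness visit, "negative" for any other visit.
    """
    sensitivity = tp / (tp + fn)   # share of true wellness visits detected
    specificity = tn / (tn + fp)   # share of other visits correctly rejected
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, balanced_accuracy


# Illustrative counts only (not taken from the paper).
sens, spec, bal = classification_metrics(tp=86, fn=14, tn=94, fp=6)

# Consistency check on the reported point estimates:
# balanced accuracy is the mean of sensitivity and specificity.
reported_balanced = (0.86 + 0.94) / 2
```

The reported sensitivity of 0.86 and specificity of 0.94 average to 0.90, which matches the balanced accuracy stated in the abstract.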


Broadband Ground Motion Synthesis via Generative Adversarial Neural Operators: Development and Validation

Shi, Yaozhong, Lavrentiadis, Grigorios, Asimaki, Domniki, Ross, Zachary E., Azizzadenesheli, Kamyar

arXiv.org Artificial Intelligence

We present a data-driven model for ground-motion synthesis using a Generative Adversarial Neural Operator (GANO) that combines recent advancements in machine learning and open access strong motion data sets to generate three-component acceleration time histories conditioned on moment magnitude ($M$), rupture distance ($R_{rup}$), time-average shear-wave velocity in the top 30 m ($V_{S30}$), and tectonic environment or style of faulting. We use Neural Operators, a resolution invariant architecture that guarantees that the model training is independent of the data sampling frequency. We first present the conditional ground-motion synthesis algorithm (referred to hereafter as cGM-GANO) and discuss its advantages compared to previous work. Next, we verify the cGM-GANO framework using simulated ground motions generated with the Southern California Earthquake Center (SCEC) Broadband Platform (BBP). We lastly train cGM-GANO on a KiK-net dataset from Japan, showing that the framework can recover the magnitude, distance, and $V_{S30}$ scaling of Fourier amplitude and pseudo-spectral accelerations. We evaluate cGM-GANO through residual analysis with the empirical dataset as well as by comparison with conventional Ground Motion Models (GMMs) for selected ground motion scenarios. Results show that cGM-GANO produces consistent median scaling with the GMMs for the corresponding tectonic environments. The largest misfit is observed at short distances due to the scarcity of training data. With the exception of short distances, the aleatory variability of the response spectral ordinates is also well captured, especially for subduction events due to the adequacy of training data. Applications of the presented framework include generation of risk-targeted ground motions for site-specific engineering applications.


Development and validation of an interpretable machine learning-based calculator for predicting 5-year weight trajectories after bariatric surgery: a multinational retrospective cohort SOPHIA study

Saux, Patrick, Bauvin, Pierre, Raverdy, Violeta, Teigny, Julien, Verkindt, Hélène, Soumphonphakdy, Tomy, Debert, Maxence, Jacobs, Anne, Jacobs, Daan, Monpellier, Valerie, Lee, Phong Ching, Lim, Chin Hong, Andersson-Assarsson, Johanna C, Carlsson, Lena, Svensson, Per-Arne, Galtier, Florence, Dezfoulian, Guelareh, Moldovanu, Mihaela, Andrieux, Severine, Couster, Julien, Lepage, Marie, Lembo, Erminia, Verrastro, Ornella, Robert, Maud, Salminen, Paulina, Mingrone, Geltrude, Peterli, Ralph, Cohen, Ricardo V, Zerrweck, Carlos, Nocca, David, Roux, Carel W Le, Caiazzo, Robert, Preux, Philippe, Pattou, François

arXiv.org Artificial Intelligence

Background Weight loss trajectories after bariatric surgery vary widely between individuals, and predicting weight loss before the operation remains challenging. We aimed to develop a model using machine learning to provide individual preoperative prediction of 5-year weight loss trajectories after surgery. Methods In this multinational retrospective observational study we enrolled adult participants (aged $\ge$18 years) from ten prospective cohorts (including ABOS [NCT01129297], BAREVAL [NCT02310178], the Swedish Obese Subjects study, and a large cohort from the Dutch Obesity Clinic [Nederlandse Obesitas Kliniek]) and two randomised trials (SleevePass [NCT00793143] and SM-BOSS [NCT00356213]) in Europe, the Americas, and Asia, with a 5-year follow-up after Roux-en-Y gastric bypass, sleeve gastrectomy, or gastric band. Patients with a previous history of bariatric surgery or large delays between scheduled and actual visits were excluded. The training cohort comprised patients from two centres in France (ABOS and BAREVAL). The primary outcome was BMI at 5 years. A model was developed using least absolute shrinkage and selection operator to select variables and the classification and regression trees algorithm to build interpretable regression trees. Model performance was assessed through the median absolute deviation (MAD) and root mean squared error (RMSE) of BMI. Findings 10 231 patients from 12 centres in ten countries were included in the analysis, corresponding to 30 602 patient-years. Among participants in all 12 cohorts, 7701 (75·3%) were female and 2530 (24·7%) were male. Among 434 baseline attributes available in the training cohort, seven variables were selected: height, weight, intervention type, age, diabetes status, diabetes duration, and smoking status. At 5 years, across external testing cohorts the overall mean MAD BMI was 2·8 kg/m$^2$ (95% CI 2·6–3·0) and mean RMSE BMI was 4·7 kg/m$^2$ (4·4–5·0), and the mean difference between predicted and observed BMI was -0·3 kg/m$^2$ (SD 4·7). This model is incorporated in an easy to use and interpretable web-based prediction tool to help inform clinical decisions before surgery. Interpretation We developed a machine learning-based model, which is internationally validated, for predicting individual 5-year weight loss trajectories after three common bariatric interventions.
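The two error measures named in the abstract above can be computed as in the sketch below. It assumes that MAD means the median of the absolute differences between predicted and observed BMI, which is one plausible reading of the abstract; the function name and the sample values are hypothetical.

```python
import math
from statistics import median
from typing import Sequence, Tuple


def mad_and_rmse(predicted: Sequence[float],
                 observed: Sequence[float]) -> Tuple[float, float]:
    """Return (median absolute deviation, root mean squared error) in kg/m^2."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - o for p, o in zip(predicted, observed)]
    mad = median([abs(e) for e in errors])
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mad, rmse


# Made-up BMI values, for illustration only.
mad, rmse = mad_and_rmse(predicted=[30.0, 32.0, 28.0],
                         observed=[29.0, 35.0, 28.0])
```

Because RMSE squares each error before averaging, it is more sensitive to a few large misses than MAD, which is consistent with the RMSE in the abstract being larger than the MAD.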


Large Language Models for Granularized Barrett's Esophagus Diagnosis Classification

Kefeli, Jenna, Soroush, Ali, Diamond, Courtney J., Zylberberg, Haley M., May, Benjamin, Abrams, Julian A., Weng, Chunhua, Tatonetti, Nicholas

arXiv.org Artificial Intelligence

Diagnostic codes for Barrett's esophagus (BE), a precursor to esophageal cancer, lack granularity and precision for many research or clinical use cases. Laborious manual chart review is required to extract key diagnostic phenotypes from BE pathology reports. We developed a generalizable transformer-based method to automate data extraction. Using pathology reports from Columbia University Irving Medical Center with gastroenterologist-annotated targets, we performed binary dysplasia classification as well as granularized multi-class BE-related diagnosis classification. We utilized two clinically pre-trained large language models, with best model performance comparable to a highly tailored rule-based system developed using the same data. Binary dysplasia extraction achieves 0.964 F1-score, while the multi-class model achieves 0.911 F1-score. Our method is generalizable and faster to implement as compared to a tailored rule-based approach.
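For readers less familiar with the metric quoted above, the sketch below shows how a binary F1-score follows from true-positive, false-positive, and false-negative counts. The function name and the counts are hypothetical, and the abstract does not state how the multi-class F1-score was averaged, so only the binary case is shown.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for one positive class."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only (not from the paper).
example_f1 = f1_score(tp=8, fp=2, fn=2)
```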


Detecting intimate partner violence circumstance for suicide: development and validation of a tool using natural language processing and supervised machine learning in the National Violent Death Reporting System - PubMed

#artificialintelligence

Background: Intimate partner violence (IPV) victims and perpetrators often report suicidal ideation, yet there is no comprehensive national dataset that allows for an assessment of the connection between IPV and suicide. Objective: To facilitate a more comprehensive understanding of the co-occurrence of IPV and suicide, we developed and validated a tool that detects mentions of IPV circumstances (yes/no) for single suicides in NVDRS death narratives. Methods: We used 10 000 hand-labelled single suicide cases from NVDRS (2010-2018) to train (n = 8500) and validate (n = 1500) a classification model using supervised machine learning. We used natural language processing to extract relevant information from the death narratives within a concept normalisation framework. We tested numerous models and present performance metrics for the best approach.


Learning structures of the French clinical language: development and validation of word embedding models using 21 million clinical reports from electronic health records

Dura, Basile, Jean, Charline, Tannier, Xavier, Calliger, Alice, Bey, Romain, Neuraz, Antoine, Flicoteaux, Rémi

arXiv.org Artificial Intelligence

Background Clinical studies using real-world data may benefit from exploiting clinical reports, a particularly rich albeit unstructured medium. To that end, natural language processing can extract relevant information. Methods based on transfer learning using pre-trained language models have achieved state-of-the-art results in most NLP applications; however, publicly available models lack exposure to speciality languages, especially in the medical field. Objective We aimed to evaluate the impact of adapting a language model to French clinical reports on downstream medical NLP tasks. Methods We leveraged a corpus of 21M clinical reports collected from August 2017 to July 2021 at the Greater Paris University Hospitals (APHP) to produce two CamemBERT architectures on speciality language: one retrained from scratch and the other using CamemBERT as its initialisation. We used two French annotated medical datasets to compare our language models to the original CamemBERT network, evaluating the statistical significance of improvement with the Wilcoxon test. Results Our models pretrained on clinical reports increased the average F1-score on APMed (an APHP-specific task) by 3 percentage points to 91%, a statistically significant improvement. They also achieved performance comparable to the original CamemBERT on QUAERO. These results hold true for the fine-tuned and from-scratch versions alike, starting from very few pre-training samples. Conclusions We confirm previous literature showing that adapting generalist pre-trained language models such as CamemBERT on speciality corpora improves their performance for downstream clinical NLP tasks. Our results suggest that retraining from scratch does not induce a statistically significant performance gain compared to fine-tuning.


Development and Validation of an AI-driven Mammographic Breast Density Classification Tool Based on Radiologist Consensus

#artificialintelligence

Mammographic breast density (BD) is commonly visually assessed using the Breast Imaging Reporting and Data System (BI-RADS) four-category scale. To overcome inter- and intraobserver variability of visual assessment, the authors retrospectively developed and externally validated software for BD classification based on convolutional neural networks from mammograms obtained between 2017 and 2020. The tool was trained using the majority BD category determined by seven board-certified radiologists who independently visually assessed 760 mediolateral oblique (MLO) images in 380 women (mean age, 57 years ± 6 [SD]) from center 1; this process mimicked training from a consensus of several human readers. External validation of the model was performed by the three radiologists whose BD assessment was closest to the majority (consensus) of the initial seven on a dataset of 384 MLO images in 197 women (mean age, 56 years ± 13) obtained from center 2. The model achieved an accuracy of 89.3% in distinguishing BI-RADS a or b (nondense breasts) versus c or d (dense breasts) categories, with an agreement of 90.4% (178 of 197 mammograms) and a reliability of 0.807 (Cohen κ) compared with the mode of the three readers. This study demonstrates accuracy and reliability of fully automated software for BD classification.
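The consensus labelling and the agreement statistic mentioned in the abstract above can be illustrated with a small sketch. The helper names `majority_category` and `cohen_kappa`, the tie handling, and the sample labels are assumptions for this example and do not reproduce the study's pipeline.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_category(votes: Sequence[Hashable]) -> Hashable:
    """Most frequent label among readers (ties go to the first label seen)."""
    return Counter(votes).most_common(1)[0][0]


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the product of each rater's label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Made-up labels: model output versus the readers' mode, dense vs nondense.
model = ["dense", "dense", "nondense", "nondense"]
readers = ["dense", "nondense", "nondense", "nondense"]
kappa = cohen_kappa(model, readers)
consensus = majority_category(["b", "c", "b"])
```

Kappa discounts the agreement two raters would reach by chance, which is why the abstract can report 90.4% raw agreement alongside a lower reliability value of 0.807.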


Development and validation of multivariable machine learning algorithms to predict risk of …

#artificialintelligence

BMJ Open. 2022 Apr 1;12(4):e053590. doi: 10.1136/bmjopen-2021-053590. ABSTRACT OBJECTIVES: To develop and validate tests to assess the risk of any …

Kaiser Permanente researchers push the envelope with AI and NLP

#artificialintelligence

Although healthcare is squarely in the era of big data and data analytics, it remains difficult in clinical research to accurately identify patients with complex conditions like valvular heart disease through medical records. And if researchers cannot identify these patients, they cannot study them, track practice patterns or conduct population management. Part of the problem is that the current methods used to identify highly specific conditions like valvular heart disease use diagnosis or procedure codes. These were created primarily for billing purposes and often are not very useful for clinical care because they can be quite nonspecific and not include detailed data about the condition. "For example, a patient with moderate or severe aortic stenosis, which is a narrowing of one of the primary heart valves, is entirely different than a patient with mild valve disease," said Dr. Matthew Solomon, a cardiologist at the Permanente Medical Group and a physician researcher at the Kaiser Permanente Division of Research in Oakland, California.